Kalman filter
The Kalman filter (named after its inventor, Rudolf E. Kalman) is an efficient recursive algorithm for tracking, in real time, a time-dependent state vector with noisy equations of motion, using the least-squares method. It is used to separate signal from noise so as to optimally predict changes in a modeled system over time. Kalman filtering is an important topic in control theory and control systems engineering, and it is used in a wide range of engineering applications from radar to computer vision.

The filter was developed in papers by Swerling (1958), Kalman (1960) and Kalman and Bucy (1961). Peter Swerling's algorithm was similar to Kalman's and appeared earlier. Stanley Schmidt is generally credited with developing the first implementation of a Kalman filter. It was during a visit by Kalman to the NASA Ames Research Center that Schmidt saw the applicability of Kalman's ideas to the problem of trajectory estimation for the Apollo program, leading to the filter's incorporation in the Apollo navigation computer.

What makes the Kalman filter distinctive is that it is purely a time-domain filter. Most filters (for example, a low-pass filter) are formulated in the frequency domain and then transformed back to the time domain for implementation.

A wide variety of Kalman filters have now been developed, from Kalman's original formulation, now called the simple Kalman filter, to Schmidt's extended filter, the information filter, and a variety of square-root filters developed by Bierman, Thornton and many others. Perhaps the most commonly used type of Kalman filter is the phase-locked loop, now ubiquitous in radios, computers, and nearly any other type of video or communications equipment.

Kalman filter basics

The Kalman filter is used to estimate the state of a dynamic system from a series of noisy measurements. To accomplish this, the Kalman filter employs three (sometimes four) models. The first model is called the state transition model.
This model describes how the state is expected to change from one timestep to the next. This is necessary because the system is dynamic, which means its underlying state is always subject to change. In reality this model is an approximation to the true process driving the dynamics of the system.

Because the state transition model is an approximation, a second model, called the process noise model, is used. This model accounts for the errors caused by the approximation.

The third model is called the observation model. This model describes how the estimation space maps into the observation space. For example, an estimate of a target being tracked may include the target's velocity even though only the target's position is measured, or the target's position may be measured in polar coordinates but estimated in Cartesian coordinates.

The last model, the control-input model, describes how the state is expected to change in response to a given control input. As with the state transition model, it is generally an approximation. Often it is not used at all, for example in target tracking applications, where the control inputs applied to the target are unknown.

The Kalman filter is a recursive estimator. This means that only the estimated state from the previous timestep and the current measurement are needed to compute the estimate for the current state. In contrast to batch estimation techniques, no history of observations or estimates is required. The filter has two phases: predict and update. The predict phase uses the estimate from the previous timestep to produce an estimate of the current state. In the update phase, measurement information from the current timestep is used to refine this prediction to arrive at a new, generally more accurate, estimate.
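The two-phase recursion described above can be illustrated with a minimal scalar sketch. It assumes a one-dimensional state, an identity state transition and observation model, and no control input; the function names and the noise variances `q` and `r` are hypothetical choices made for illustration only.

```python
def predict(x, p, q):
    """Predict phase: with an identity transition the mean is unchanged,
    while the variance grows by the process noise variance q."""
    return x, p + q


def update(x_pred, p_pred, z, r):
    """Update phase: blend the prediction with the measurement z,
    whose noise variance is r."""
    k = p_pred / (p_pred + r)            # scalar gain, between 0 and 1
    x_new = x_pred + k * (z - x_pred)    # correct the prediction by the weighted residual
    p_new = (1.0 - k) * p_pred           # uncertainty shrinks after the update
    return x_new, p_new


def run_filter(measurements, x0, p0, q, r):
    """Run the recursion; only the previous estimate (x, p) is carried forward."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        x, p = predict(x, p, q)
        x, p = update(x, p, z, r)
        estimates.append((x, p))
    return estimates
```

Note that `run_filter` never stores past measurements: each pass through the loop needs only the previous estimate and the current measurement, which is the recursive property described above.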
The Kalman filter assumes a linear state space whose state transition is of the following form:

\textbf{x}_{k} = \textbf{F}_{k} \textbf{x}_{k-1} + \textbf{B}_{k}\textbf{u}_{k} + \textbf{w}_{k}

The state transition model \textbf{F}_{k} is applied to the previous state \textbf{x}_{k-1}. The process noise \textbf{w}_{k} is the part of the state transition which is not modelled by the state transition model. The presence of process noise requires the use of a process noise model in the filter. The process noise at time k is assumed to be Gaussian white noise with covariance

\textbf{Q}_{k} \delta(k-j) = E[\textbf{w}_{k}\textbf{w}_{j}^{T}]

The control input \textbf{u}_{k} is mapped into the estimation space by the control-input model \textbf{B}_{k}.

The observation is a function of the true state and is assumed to be of the form

\textbf{z}_{k} = \textbf{H}_{k} \textbf{x}_{k} + \textbf{v}_{k}

where \textbf{H}_{k} is the observation model and \textbf{v}_{k} is the observation noise at time k, which is assumed to be Gaussian white noise with covariance

\textbf{R}_{k} \delta(k-j) = E[\textbf{v}_{k}\textbf{v}_{j}^{T}]

Kalman filter equations

The Kalman filter is used to obtain an estimate \hat{\textbf{x}}_{k|k} of the true state \textbf{x}_{k} using only the measurements \textbf{z}_{i} \; \forall i \in \{0, \ldots, k\} and control inputs \textbf{u}_{i} \; \forall i \in \{0, \ldots, k\}.
Predict

\hat{\textbf{x}}_{k|k-1} = \textbf{F}_{k}\hat{\textbf{x}}_{k-1|k-1} + \textbf{B}_{k} \textbf{u}_{k}

\hat{\textbf{P}}_{k|k-1} = \textbf{F}_{k} \hat{\textbf{P}}_{k-1|k-1} \textbf{F}_{k}^{T} + \textbf{Q}_{k}

Update

\textbf{K}_{k} = \hat{\textbf{P}}_{k|k-1}\textbf{H}_{k}^{T}(\textbf{H}_{k}\hat{\textbf{P}}_{k|k-1}\textbf{H}_{k}^{T} + \textbf{R}_{k})^{-1}

\hat{\textbf{x}}_{k|k} = \hat{\textbf{x}}_{k|k-1} + \textbf{K}_{k}(\textbf{z}_{k} - \textbf{H}_{k}\hat{\textbf{x}}_{k|k-1})

\hat{\textbf{P}}_{k|k} = (I - \textbf{K}_{k} \textbf{H}_{k})\hat{\textbf{P}}_{k|k-1}

Numerically stable update

The covariance update equation above is valid only when the gain is computed exactly as

\textbf{K}_{k} = \hat{\textbf{P}}_{k|k-1}\textbf{H}_{k}^{T}(\textbf{H}_{k}\hat{\textbf{P}}_{k|k-1}\textbf{H}_{k}^{T} + \textbf{R}_{k})^{-1}

If the gain is inexact because of computational error, the following form (often called the Joseph form) gives greater numerical stability:

\hat{\textbf{P}}_{k|k} = (I - \textbf{K}_{k} \textbf{H}_{k})\hat{\textbf{P}}_{k|k-1}(I - \textbf{K}_{k} \textbf{H}_{k})^{T} + \textbf{K}_{k} \textbf{R}_{k}\textbf{K}_{k}^{T}

Optimality

In the linear, Gaussian case, where the same linear models drive the true and estimated states and where the noise parameters used in the filter match the true noises perturbing the underlying state and observations, the estimated covariance is a faithful representation of the true covariance of the state, that is,

\hat{\textbf{P}}_{k|k} = \textbf{P}_{k} = E[(\textbf{x}_{k} - \hat{\textbf{x}}_{k|k})(\textbf{x}_{k} - \hat{\textbf{x}}_{k|k})^{T}]

Similarly, the following also holds in this case:

\hat{\textbf{P}}_{k|k-1} = E[(\textbf{x}_{k} - \hat{\textbf{x}}_{k|k-1})(\textbf{x}_{k} - \hat{\textbf{x}}_{k|k-1})^{T}]

In this case the Kalman filter is an optimal estimator of the true state in a least-squares sense.

Example

Consider tracking a particle moving in one dimension that is perturbed by Gaussian accelerations, using a piecewise-constant process model.
(Note: time indices for F, G, H, R and Q have been dropped.)

The position and velocity of a point particle are described by the linear state space

\textbf{x}_{k} = \begin{bmatrix} X & \dot{X} \end{bmatrix}^{T}

where \dot{X} is the velocity, that is, the derivative of position. Between the (k − 1)th and kth timestep the particle undergoes an acceleration \textbf{w}_{k} that is Gaussian with zero mean and covariance \textbf{Q}, written N( \textbf{w}_{k}, 0, \textbf{Q} ). The updated position and velocity

\textbf{x}_{k} = \textbf{F} \textbf{x}_{k-1} + \textbf{G w}_{k}

where

\textbf{F} = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix}

and

\textbf{G} = \begin{bmatrix} \frac{T^{2}}{2} & T \end{bmatrix}^{T}

follow from the Newtonian equations of motion. T is the time difference between the (k − 1)th and kth timestep. (N.B. This process model requires that T is constant.)

At each timestep, a noisy measurement of the true position of the particle is made:

\textbf{z}_{k} = \textbf{H x}_{k} + \textbf{v}_{k}

where the measurement noise is distributed as N( \textbf{v}_{k}, 0, \textbf{R} ) and

\textbf{H} = \begin{bmatrix} 1 & 0 \end{bmatrix}

is the observation model.

Using these measurements an estimate of the state can be computed. The predicted state at the kth timestep, using the estimated state from the (k − 1)th timestep, is

\hat{\textbf{x}}_{k|k-1} = \textbf{F} \hat{\textbf{x}}_{k-1|k-1}

and the predicted covariance is

\hat{\textbf{P}}_{k|k-1} = \textbf{F}\hat{\textbf{P}}_{k-1|k-1}\textbf{F}^{T} + \textbf{GQG}^{T}

The predicted state and covariance are updated with the measurement and measurement (co)variance. The measurement innovation (or residual)

\tilde{\textbf{y}}_{k} = \textbf{z}_{k} - \textbf{H}\hat{\textbf{x}}_{k|k-1}

is the difference between the actual and predicted measurements, while the innovation (residual) covariance

\textbf{S}_{k} = \textbf{H}\hat{\textbf{P}}_{k|k-1}\textbf{H}^{T} + \textbf{R}

is the sum of the predicted covariance, projected into the measurement space, and the measurement covariance.
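The quantities defined so far can be sketched in plain Python. The block below computes the prediction, innovation and innovation covariance for this constant-velocity model, then completes the cycle with the gain and the numerically stable covariance update from the general filter equations given earlier. The helper functions, the name `kalman_step`, and the scalar variances `q` and `r` (standing in for Q and R) are assumptions made for illustration, not part of the original example.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def scale(a, s):
    """Multiply every element of matrix a by the scalar s."""
    return [[s * v for v in row] for row in a]


def kalman_step(x, P, z, T, q, r):
    """One predict/update cycle.

    x is a 2x1 column [[position], [velocity]], P a 2x2 covariance,
    z a scalar position measurement, T the (constant) time step.
    """
    F = [[1.0, T], [0.0, 1.0]]
    G = [[T * T / 2.0], [T]]
    H = [[1.0, 0.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]

    # Predict: x_pred = F x, P_pred = F P F^T + G Q G^T (Q is the scalar q here)
    x_pred = mat_mul(F, x)
    P_pred = mat_add(mat_mul(mat_mul(F, P), transpose(F)),
                     scale(mat_mul(G, transpose(G)), q))

    # Innovation and its covariance (both scalars, since H is 1x2)
    y = z - mat_mul(H, x_pred)[0][0]
    S = mat_mul(mat_mul(H, P_pred), transpose(H))[0][0] + r

    # Gain K = P_pred H^T S^-1, a 2x1 column
    K = scale(mat_mul(P_pred, transpose(H)), 1.0 / S)

    # State update and numerically stable covariance update
    x_new = mat_add(x_pred, scale(K, y))
    A = mat_add(identity, scale(mat_mul(K, H), -1.0))      # I - K H
    P_new = mat_add(mat_mul(mat_mul(A, P_pred), transpose(A)),
                    scale(mat_mul(K, transpose(K)), r))    # + K R K^T
    return x_new, P_new, y, S
```

Calling `kalman_step` repeatedly, feeding each returned `x_new` and `P_new` back in as `x` and `P`, gives the full filter for a stream of position measurements.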
The Kalman gain

\textbf{K}_{k} = \hat{\textbf{P}}_{k|k-1}\textbf{H}^{T}\textbf{S}_{k}^{-1}

is the ratio between the predicted covariance and the residual covariance. The updated state estimate

\hat{\textbf{x}}_{k|k} = \hat{\textbf{x}}_{k|k-1} + \textbf{K}_{k} \tilde{\textbf{y}}_{k}

is the predicted state plus the measurement innovation weighted by the Kalman gain. The updated estimated covariance is

\hat{\textbf{P}}_{k|k} = \chi_{k}\hat{\textbf{P}}_{k|k-1}\chi_{k}^{T} + \textbf{K}_{k} \textbf{RK}_{k}^{T}

where

\chi_{k} = I - \textbf{K}_{k} \textbf{H}

The Kalman gain will converge to a steady-state value if Q and R are time-invariant (F and H are already constant in this example). The steady-state Kalman gain can then be precomputed. This reduces the Kalman filter to an ordinary observer, which is computationally simpler.

Derivation

The Kalman filter can be derived in several ways. The one presented here uses probability theory. The true state is assumed to be an unobserved Markov process, and the measurements are the observed states of a hidden Markov model (HMM). Because of the Markov assumption, the true state depends only on the immediately preceding state and, given that state, is conditionally independent of all earlier states:

p(\textbf{x}_k|\textbf{x}_0,...,\textbf{x}_{k-1}) = p(\textbf{x}_k|\textbf{x}_{k-1})

Similarly, the measurement at the kth timestep depends only on the current state and, given that state, is conditionally independent of all other states:

p(\textbf{z}_k|\textbf{x}_0,...,\textbf{x}_{k}) = p(\textbf{z}_k|\textbf{x}_{k} )

Using these assumptions, the probability distribution over all states of the HMM can be written simply as:

p(\textbf{x}_0,...,\textbf{x}_k,\textbf{z}_1,...,\textbf{z}_k) = p(\textbf{x}_0)\prod_{i=1}^k p(\textbf{z}_i|\textbf{x}_i)p(\textbf{x}_i|\textbf{x}_{i-1})

However, when the Kalman filter is used to estimate the state \textbf{x}, the probability distribution of interest is that of the current state conditioned on the measurements up to the current timestep.
(This is achieved by marginalising out the previous states and dividing by the probability of the measurement set.)

This leads to the predict and update steps of the Kalman filter written probabilistically. The probability distribution associated with the predicted state is the product of the probability distribution associated with the transition from the (k − 1)th timestep to the kth and the probability distribution associated with the previous state, with the true state at (k − 1) integrated out:

p(\textbf{x}_k|\textbf{Z}_{k-1}) = \int p(\textbf{x}_k | \textbf{x}_{k-1}) p(\textbf{x}_{k-1} | \textbf{Z}_{k-1} ) \, d\textbf{x}_{k-1}

The measurement set up to time t is

\textbf{Z}_{t} = \left \{ \textbf{z}_{1},...,\textbf{z}_{t} \right \}

The probability distribution of the updated state is proportional to the product of the measurement likelihood and the predicted state:

p(\textbf{x}_k|\textbf{Z}_{k}) = \frac{p(\textbf{z}_k|\textbf{x}_k) p(\textbf{x}_k|\textbf{Z}_{k-1})}{p(\textbf{z}_k|\textbf{Z}_{k-1})}

The denominator

p(\textbf{z}_k|\textbf{Z}_{k-1}) = \int p(\textbf{z}_k|\textbf{x}_k) p(\textbf{x}_k|\textbf{Z}_{k-1}) d\textbf{x}_k

is an unimportant normalisation term.

The remaining probability density functions (PDFs) are:

p(\textbf{x}_k | \textbf{x}_{k-1}) = N(\textbf{x}_k, \textbf{F}_k\textbf{x}_{k-1}, \textbf{Q}_k)

p(\textbf{z}_k|\textbf{x}_k) = N(\textbf{z}_k,\textbf{H}_{k}\textbf{x}_k, \textbf{R}_k)

p(\textbf{x}_{k-1}|\textbf{Z}_{k-1}) = N(\textbf{x}_{k-1},\hat{\textbf{x}}_{k-1},\textbf{P}_{k-1} )

Note that the PDF at the previous timestep is inductively assumed to be given by the estimated state and covariance. This is justified because, as an optimal estimator, the Kalman filter makes best use of the measurements; therefore the PDF for \mathbf{x}_k given the measurements \mathbf{Z}_k is the Kalman filter estimate.
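As a worked step, substituting the Gaussian densities listed above into the prediction integral gives another Gaussian, whose mean and covariance are exactly the predict equations (with no control input, as in the densities above). The block below states this standard result for products of linear-Gaussian densities in the same N(variable, mean, covariance) notation; it is a sketch of the outcome rather than a full proof.

```latex
p(\textbf{x}_k|\textbf{Z}_{k-1})
  = \int N(\textbf{x}_k, \textbf{F}_k\textbf{x}_{k-1}, \textbf{Q}_k)\,
         N(\textbf{x}_{k-1}, \hat{\textbf{x}}_{k-1}, \textbf{P}_{k-1})\, d\textbf{x}_{k-1}
  = N(\textbf{x}_k,\ \textbf{F}_k\hat{\textbf{x}}_{k-1},\
      \textbf{F}_k\textbf{P}_{k-1}\textbf{F}_k^{T} + \textbf{Q}_k)
```

Multiplying this predicted density by the measurement likelihood N(\textbf{z}_k, \textbf{H}_{k}\textbf{x}_k, \textbf{R}_k) and normalising again yields a Gaussian, whose mean and covariance are given by the update equations.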
Information filter

In the information filter, or inverse covariance filter, the estimated covariance and estimated state are replaced by the information matrix and information vector respectively:

\hat{\textbf{Y}}_{k|k} \equiv \hat{\textbf{P}}_{k|k}^{-1}

\hat{\textbf{y}}_{k|k} \equiv \hat{\textbf{P}}_{k|k}^{-1}\hat{\textbf{x}}_{k|k}

Similarly the predicted covariance and state have equivalent information forms,

\hat{\textbf{Y}}_{k|k-1} \equiv \hat{\textbf{P}}_{k|k-1}^{-1}

\hat{\textbf{y}}_{k|k-1} \equiv \hat{\textbf{P}}_{k|k-1}^{-1}\hat{\textbf{x}}_{k|k-1}

as have the measurement covariance and measurement vector:

\textbf{I}_{k} \equiv \textbf{H}_{k}^{T} \textbf{R}_{k}^{-1} \textbf{H}_{k}

\textbf{i}_{k} \equiv \textbf{H}_{k}^{T} \textbf{R}_{k}^{-1} \textbf{z}_{k}

The information update now becomes a trivial sum:

\hat{\textbf{Y}}_{k|k} = \hat{\textbf{Y}}_{k|k-1} + \textbf{I}_{k}

\hat{\textbf{y}}_{k|k} = \hat{\textbf{y}}_{k|k-1} + \textbf{i}_{k}

The main advantage of the information filter is that N measurements can be filtered at each timestep simply by summing their information matrices and vectors:

\hat{\textbf{Y}}_{k|k} = \hat{\textbf{Y}}_{k|k-1} + \sum_{j=1}^N \textbf{I}_{k,j}

\hat{\textbf{y}}_{k|k} = \hat{\textbf{y}}_{k|k-1} + \sum_{j=1}^N \textbf{i}_{k,j}

To predict the information filter, the information matrix and vector can be converted back to their state space equivalents, or alternatively the information space prediction can be used:

\textbf{M}_{k} = [\textbf{F}_{k}^{-1}]^{T} \hat{\textbf{Y}}_{k-1|k-1} \textbf{F}_{k}^{-1}

\textbf{C}_{k} = \textbf{M}_{k} [\textbf{M}_{k}+\textbf{Q}_{k}^{-1}]^{-1}

\textbf{L}_{k} = I - \textbf{C}_{k}

\hat{\textbf{Y}}_{k|k-1} = \textbf{L}_{k} \textbf{M}_{k} \textbf{L}_{k}^{T} + \textbf{C}_{k} \textbf{Q}_{k}^{-1} \textbf{C}_{k}^{T}

\hat{\textbf{y}}_{k|k-1} = \textbf{L}_{k} [\textbf{F}_{k}^{-1}]^{T} \hat{\textbf{y}}_{k-1|k-1}

Note that if F and Q are time-invariant these values can be cached.
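The additive form of the information update can be sketched for a scalar state, where the information matrix and vector reduce to plain numbers. The function names and the `(z, h, r)` tuple layout (measurement, observation coefficient, noise variance) are hypothetical choices made for this illustration.

```python
def information_update(Y_pred, y_pred, measurements):
    """Fuse any number of scalar measurements by summing their information.

    Each measurement is a tuple (z, h, r): value, observation coefficient
    and noise variance. Its contributions are I = h*h/r and i = h*z/r.
    """
    Y = Y_pred + sum(h * h / r for (z, h, r) in measurements)
    y = y_pred + sum(h * z / r for (z, h, r) in measurements)
    return Y, y


def to_state(Y, y):
    """Convert information form back to (state estimate, variance)."""
    return y / Y, 1.0 / Y


def to_information(x, p):
    """Convert (state estimate, variance) to information form (Y, y)."""
    return 1.0 / p, x / p
```

For example, starting from a prior with mean 0 and variance 1 and fusing two measurements `(1.0, 1.0, 1.0)` in one call gives the same estimate as applying two ordinary scalar Kalman updates in sequence, but the order of the measurements no longer matters because the update is a sum.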
The information-space prediction also requires F and Q to be invertible.

Non-linear filters

The basic Kalman filter is limited to linear models. However, most non-trivial systems are non-linear. The non-linearity can be associated with the process model, the observation model, or both.

Extended Kalman filter

In the Extended Kalman filter (EKF) the state transition and observation models need not be linear functions of the state but may instead be (differentiable) functions:

\textbf{x}_{k} = f(\textbf{x}_{k-1}, \textbf{u}_{k}, \textbf{w}_{k})

\textbf{z}_{k} = h(\textbf{x}_{k}, \textbf{v}_{k})

The function f can be used to compute the predicted state from the previous estimate, and similarly the function h can be used to compute the predicted measurement from the predicted state. However, f and h cannot be applied to the covariance directly. Instead a matrix of partial derivatives (the Jacobian) is computed:

\textbf{F}_{k} = \left . \frac{\partial f}{\partial \textbf{x} } \right \vert _{\hat{\textbf{x}}_{k-1|k-1},\textbf{u}_{k}}

\textbf{H}_{k} = \left . \frac{\partial h}{\partial \textbf{x} } \right \vert _{\hat{\textbf{x}}_{k|k-1}}

At each timestep the Jacobians are evaluated at the current estimates: \textbf{F}_{k} at the previous updated estimate and \textbf{H}_{k} at the predicted state. These matrices can then be used in the Kalman filter equations. This process essentially linearises the non-linear functions around the current estimate.
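The linearisation step can be illustrated with a small sketch. It assumes a hypothetical observation function that measures a two-dimensional Cartesian position in polar form (range and bearing) with the noise argument set to zero, and it checks the analytic Jacobian against a finite-difference approximation. None of these names come from the article.

```python
import math


def h(state):
    """Hypothetical observation function: Cartesian (px, py) to (range, bearing)."""
    px, py = state
    return [math.hypot(px, py), math.atan2(py, px)]


def jacobian_h(state):
    """Analytic Jacobian of h, evaluated at the given state (undefined at the origin)."""
    px, py = state
    r2 = px * px + py * py
    r = math.sqrt(r2)
    return [[px / r, py / r],        # d(range)/d(px),   d(range)/d(py)
            [-py / r2, px / r2]]     # d(bearing)/d(px), d(bearing)/d(py)


def numerical_jacobian(func, state, eps=1e-6):
    """Forward-difference approximation, useful as a check on a hand-derived Jacobian."""
    base = func(state)
    columns = []
    for j in range(len(state)):
        bumped = list(state)
        bumped[j] += eps
        out = func(bumped)
        columns.append([(out[i] - base[i]) / eps for i in range(len(base))])
    # columns[j][i] holds d(out_i)/d(state_j); transpose into row-major form
    return [[columns[j][i] for j in range(len(state))] for i in range(len(base))]
```

In an EKF built on this sketch, `jacobian_h` would be evaluated at the predicted state to obtain the observation matrix for that timestep, so the matrix changes every cycle even though `h` itself does not.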
Using these Jacobians in the Kalman filter equations gives the following extended Kalman filter equations:

Predict

\hat{\textbf{x}}_{k|k-1} = f(\hat{\textbf{x}}_{k-1|k-1}, \textbf{u}_{k}, 0)

\hat{\textbf{P}}_{k|k-1} = \textbf{F}_{k} \hat{\textbf{P}}_{k-1|k-1} \textbf{F}_{k}^{T} + \textbf{Q}_{k}

Update

\textbf{K}_{k} = \hat{\textbf{P}}_{k|k-1}\textbf{H}_{k}^{T}(\textbf{H}_{k}\hat{\textbf{P}}_{k|k-1}\textbf{H}_{k}^{T} + \textbf{R}_{k})^{-1}

\hat{\textbf{x}}_{k|k} = \hat{\textbf{x}}_{k|k-1} + \textbf{K}_{k}(\textbf{z}_{k} - h(\hat{\textbf{x}}_{k|k-1}, 0))

\hat{\textbf{P}}_{k|k} = (I - \textbf{K}_{k} \textbf{H}_{k})\hat{\textbf{P}}_{k|k-1}

Unscented Kalman filter

The Extended Kalman filter gives particularly poor performance on highly non-linear functions because only the mean is propagated through the non-linearity. The Unscented Kalman filter (UKF) [JU97] uses a deterministic sampling technique to pick a minimal set of sample points (called sigma points) around the mean. These sigma points are then propagated through the non-linear functions and the estimated covariance is then recovered. The result is a filter which more accurately captures the true mean and covariance. (This can be verified using Monte Carlo sampling.) In addition, this technique removes the requirement to calculate Jacobians, which for complex functions can be a difficult task in itself.

Predict

As with the EKF, the UKF prediction can be used independently from the UKF update, in combination with a linear (or indeed EKF) update, or vice versa.

The estimated state and covariance are augmented with the mean and covariance of the process noise:

\textbf{x}_{k-1|k-1}^{a} = [ \hat{\textbf{x}}_{k-1|k-1}^{T} \quad E[\textbf{w}_{k}^{T}] \ ]^{T}

\textbf{P}_{k-1|k-1}^{a} = \begin{bmatrix} \hat{\textbf{P}}_{k-1|k-1} & 0 \\ 0 & \textbf{Q}_{k} \end{bmatrix}

A set of 2L+1 sigma points is derived from the augmented state and covariance, where L is the dimension of the augmented state:

\chi_{k-1|k-1}^{0} = \textbf{x}_{k-1|k-1}^{a}

\chi_{k-1|k-1}^{i} = \textbf{x}_{k-1|k-1}^{a} + \left ( \sqrt{ (L + \lambda) \textbf{P}_{k-1|k-1}^{a} } \right )_{i} \quad i = 1..L

\chi_{k-1|k-1}^{i} = \textbf{x}_{k-1|k-1}^{a} - \left ( \sqrt{ (L + \lambda) \textbf{P}_{k-1|k-1}^{a} } \right )_{i-L} \quad i = L+1..2L

where the subscript on the bracket denotes the corresponding column of the matrix square root. The sigma points are propagated through the transition function f.
\chi_{k|k-1}^{i} = f(\chi_{k-1|k-1}^{i}) \quad i = 0..2L

The weighted sigma points are recombined to produce the predicted state and covariance:

\hat{\textbf{x}}_{k|k-1} = \sum_{i=0}^{2L} W_{s}^{i} \chi_{k|k-1}^{i}

\hat{\textbf{P}}_{k|k-1} = \sum_{i=0}^{2L} W_{c}^{i}\ [\chi_{k|k-1}^{i} - \hat{\textbf{x}}_{k|k-1}] [\chi_{k|k-1}^{i} - \hat{\textbf{x}}_{k|k-1}]^{T}

where the weights for the state and covariance are given by:

W_{s}^{0} = \frac{\lambda}{L+\lambda}

W_{c}^{0} = \frac{\lambda}{L+\lambda} + (1 - \alpha^2 + \beta)

W_{s}^{i} = W_{c}^{i} = \frac{1}{2(L+\lambda)} \quad i = 1..2L

\lambda = \alpha^2 (L+\kappa) - L

Typical values for \alpha , \beta , and \kappa are 10^{-3} , 2 and 0 respectively. (These values should suffice for most purposes.)

Update

The predicted state and covariance are augmented as before, except now with the mean and covariance of the measurement noise:

\textbf{x}_{k|k-1}^{a} = [ \hat{\textbf{x}}_{k|k-1}^{T} \quad E[\textbf{v}_{k}^{T}] \ ]^{T}

\textbf{P}_{k|k-1}^{a} = \begin{bmatrix} \hat{\textbf{P}}_{k|k-1} & 0 \\ 0 & \textbf{R}_{k} \end{bmatrix}

As before, a set of 2L+1 sigma points is derived from the augmented state and covariance, in the same way as in the prediction step, where L is the dimension of the augmented state.

Alternatively, if the UKF prediction has been used, the sigma points themselves can be augmented along the following lines:

\chi_{k|k-1} := [ \chi_{k|k-1}^{T} \quad E[\textbf{v}_{k}^{T}] \ ]^{T} \pm \sqrt{ (L + \lambda) \textbf{R}_{k}^{a} }

where

\textbf{R}_{k}^{a} = \begin{bmatrix} 0 & 0 \\ 0 & \textbf{R}_{k} \end{bmatrix}

The sigma points are projected through the observation function h:

\gamma_{k}^{i} = h(\chi_{k|k-1}^{i}) \quad i = 0..2L

The weighted sigma points are recombined to produce the predicted measurement and predicted measurement covariance.
\hat{\textbf{z}}_{k} = \sum_{i=0}^{2L} W_{s}^{i} \gamma_{k}^{i}

\textbf{P}_{z_{k}z_{k}} = \sum_{i=0}^{2L} W_{c}^{i}\ [\gamma_{k}^{i} - \hat{\textbf{z}}_{k}] [\gamma_{k}^{i} - \hat{\textbf{z}}_{k}]^{T}

The state-measurement cross-covariance matrix,

\textbf{P}_{x_{k}z_{k}} = \sum_{i=0}^{2L} W_{c}^{i}\ [\chi_{k|k-1}^{i} - \hat{\textbf{x}}_{k|k-1}] [\gamma_{k}^{i} - \hat{\textbf{z}}_{k}]^{T}

is used to compute the UKF Kalman gain:

K_{k} = \textbf{P}_{x_{k}z_{k}} \textbf{P}_{z_{k}z_{k}}^{-1}

As with the Kalman filter, the updated state is the predicted state plus the innovation weighted by the Kalman gain,

\hat{\textbf{x}}_{k|k} = \hat{\textbf{x}}_{k|k-1} + K_{k}( \textbf{z}_{k} - \hat{\textbf{z}}_{k} )

and the updated covariance is the predicted covariance minus the predicted measurement covariance weighted by the Kalman gain:

\hat{\textbf{P}}_{k|k} = \hat{\textbf{P}}_{k|k-1} - K_{k} \textbf{P}_{z_{k}z_{k}} K_{k}^{T}

Applications

Inertial guidance system
Autopilot
Satellite navigation systems
Simultaneous localization and mapping

References

Kalman, R. E. A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME - Journal of Basic Engineering, Vol. 82, pp. 35-45 (1960).
Kalman, R. E. and Bucy, R. S. New Results in Linear Filtering and Prediction Theory. Transactions of the ASME - Journal of Basic Engineering, Vol. 83, pp. 95-107 (1961).
[JU97] Julier, Simon J. and Jeffrey K. Uhlmann. A New Extension of the Kalman Filter to Nonlinear Systems. In The Proceedings of AeroSense: The 11th International Symposium on Aerospace/Defense Sensing, Simulation and Controls, Multi Sensor Fusion, Tracking and Resource Management II, SPIE, 1997.

See also

Compare with: Wiener filter, and the multimodal Particle filter estimator.

External links

[http://www.negenborn.net/kal_loc/ Kalman Filters, thorough introduction to several types, together with applications to Robot Localization]
The Kalman Filter
Kalman Filtering
Kalman filters
Fast Kalman Filter (FKF) - the invention of Dr. Antti Lange